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of  runs.  The  special  case  of  a process  in  which  the  variables  take  on 
only  two  values  is  useful  as  a model  for  the  counting  process  in  a 
discrete-time point process. An application to the modelling of errors

1. INTRODUCTION


In this paper we will introduce a simple method for obtaining a stationary sequence of dependent random variables having a specified marginal distribution and correlation structure (second-order joint moments). One advantage of the model is that these two aspects can be specified independently of each other. Another advantage is that the sequence is obtained as a very simple transformation of a sequence of independent random variables. The model is analogous to the models for dependent sequences of exponential random variables introduced in Jacobs and Lewis (1977) and Lawrance and Lewis (1977).

We will now define some quantities which will be used throughout the paper. Let $\{Y_n\}$ be a sequence of independent random variables taking values in a discrete space $E$, each having the distribution $\pi$. Let $\{U_n\}$ and $\{V_n\}$ be independent sequences of independent random variables taking the values 0 and 1 with

$$P\{U_n = 1\} = \beta \quad\text{and}\quad P\{V_n = 1\} = \rho \tag{1.1}$$

for fixed $0 \le \beta \le 1$ and $0 \le \rho < 1$. Let $\{S_n\}$ be a sequence of independent identically distributed random variables taking the values $0, 1, 2, \ldots, N$ with distribution $F$, where $N$ is a fixed integer.

The most general case which we will consider here is a sequence of random variables $\{X_n\}$ which is formed according to the probabilistic linear model

$$X_n = U_n Y_{n-S_n} + (1-U_n)\,A_{n-(N+1)} \tag{1.2}$$

for $n = 1, 2, \ldots$, where

$$A_n = V_n A_{n-1} + (1-V_n)\,Y_n. \tag{1.3}$$

The model of (1.2) and (1.3) will be termed DARMA(1,N+1) (discrete mixed autoregressive-moving average process with autoregression of order 1 and moving average of order N+1).
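The construction in (1.1)-(1.3) can be sketched in a few lines of Python. This is only an illustrative simulation under stated assumptions, not code from the paper: the function name `simulate_darma`, the dictionary representation of $\pi$, the list representation of $F$ and the seeded generator are all choices made for the example.

```python
import random


def simulate_darma(n_steps, pi, F, beta, rho, seed=0):
    """Simulate X_1..X_{n_steps} from the backward DARMA(1, N+1) model.

    pi   : dict mapping each state to its probability (marginal distribution)
    F    : list of probabilities for S_n on {0, ..., N}
    beta : P{U_n = 1}, the probability of choosing the moving-average part
    rho  : P{V_n = 1}, the probability that A_n repeats A_{n-1}
    """
    rng = random.Random(seed)
    values, weights = zip(*pi.items())
    N = len(F) - 1

    def draw_y():
        return rng.choices(values, weights)[0]

    # y[m + N] holds Y_m for m = -N, ..., n_steps.
    y = [draw_y() for _ in range(n_steps + N + 1)]

    # a[m + N + 1] holds A_m for m = -(N+1), ..., n_steps.
    # The initial value A_{-(N+1)} is drawn from pi.
    a = [draw_y()]
    for m in range(-N, n_steps + 1):
        a.append(a[-1] if rng.random() < rho else y[m + N])

    x = []
    for n in range(1, n_steps + 1):
        if rng.random() < beta:            # U_n = 1: take Y_{n - S_n}
            s = rng.choices(range(N + 1), F)[0]
            x.append(y[n - s + N])
        else:                              # U_n = 0: take A_{n-(N+1)}
            x.append(a[n])                 # index n-(N+1) + (N+1) = n
    return x
```

For example, `simulate_darma(1000, {0: 0.7, 1: 0.3}, [0.5, 0.3, 0.2], beta=0.4, rho=0.6)` returns a binary sequence with $N = 2$. Setting `beta=0.0` gives a shifted DAR(1) sequence, and `beta=1.0` gives a DMA(N) sequence.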

If we start the process with $A_{-(N+1)}$ having the distribution $\pi$, independent of $\{Y_n;\ n \ge -N\}$, $\{U_n\}$, $\{V_n\}$ and $\{S_n\}$, then $\{X_n\}$, $n = 1, 2, \ldots$, will be shown to form a stationary sequence of dependent discrete random variables having marginal distribution $\pi$. This stationary sequence is in general not Markovian, although it will be so if $\beta = 0$. Its correlation structure is determined by the parameters $\beta$ and $\rho$ and the distribution $F$. Note that $\pi$ can be any distribution. Some cases of discrete distributions of particular interest are obtained by choosing $\pi$ to be geometric or Poisson.

Certain special cases of the DARMA(1,N+1) process are of particular interest, and their consideration will make the nomenclature clear.

(i) The DAR(1) process.

If $\beta = 0$, then

$$X_n = A_{n-(N+1)} = \begin{cases} A_{n-(N+1)-1} & \text{with probability } \rho, \\ Y_{n-(N+1)} & \text{with probability } (1-\rho). \end{cases} \tag{1.4}$$

$\{A_n\}$ is called the DAR(1) process (discrete autoregressive process of order 1).

(ii) The DMA(N) process.

If $\beta = 1$, then $X_n = Y_{n-S_n} = Y_{n-k}$ with probability $F(k)$ for $k = 0, 1, \ldots, N$, where $F$ is the distribution of $S_n$. In this case, $\{X_n\}$ is called a DMA(N) process (discrete moving average process of order N). Note that if $\{X_n\}$ is a DMA(1) process, then

$$X_n = \begin{cases} Y_n & \text{with probability } F(0), \\ Y_{n-1} & \text{with probability } 1 - F(0). \end{cases}$$

(iii) The DARMA(1,1) process.

Finally, if $N = 0$, then $\{X_n\}$ will be termed a DARMA(1,1) process (discrete mixed autoregressive-moving average process, both of order 1) with parameters $\beta$ and $\rho$; that is,

$$X_n = \begin{cases} Y_n & \text{with probability } \beta, \\ A_{n-1} & \text{with probability } (1-\beta). \end{cases}$$

Note that the DMA(1) process is a special case of the DARMA(1,1) process when $\rho = 0$.

(iv) Independent process.

If $\beta = 1$, $N = 0$ or $\beta = 0$, $\rho = 0$, then $\{X_n\}$ is a sequence of independent random variables with common distribution $\pi$.

The model of (1.2) and (1.3) is really the backward DARMA model. The forward model is defined in a similar fashion. However, the two, while similar, are not necessarily equivalent. This is because $\{X_n\}$ is not in general time reversible, in the sense that a finite stretch $\{X_1, \ldots, X_k\}$ will not in general have the same joint distribution as the same stretch taken in reverse order. The properties of one model can be derived by the same techniques as those of the other, so we will only consider the backward model.

Note that the DARMA process may be defined using any sequence $\{Y_n\}$ of independent identically distributed random variables, not necessarily discrete. However, (1.2) and (1.3) show that $\{X_n\}$ is obtained as a mixture of the $\{Y_n\}$ sequence. As a result, even if the distribution of $Y_n$ is continuous, a realization of the sequence $\{X_n\}$ will in general contain many runs of a single value. This seems to be the major drawback to using this scheme to obtain a sequence of dependent random variables with a specified continuous marginal distribution and correlation structure. Other schemes for obtaining sequences of dependent exponential and gamma random variables have been proposed which look more promising; cf. Lawrance and Lewis (1977), Gaver and Lewis (1977), and Jacobs and Lewis (1977).

One motivation behind the DARMA models was to provide a simple scheme for obtaining models with which to analyse stationary sequences of dependent discrete random variables with specified marginal distribution and correlation structure. In general, there is not much available for modelling dependent sequences of random variables beyond the Markov chain model, which is overparametrized for statistical purposes. In addition it is very often simple to show from data that the correlation structure of the sequence is not Markovian. The DARMA model can be used to model non-Markovian sequences of discrete random variables; an observed sequence of this kind is discussed in the last section.

Another motivation for the development of this process was to provide models for point processes in which the data are given in terms of counts in fixed time intervals rather than the exact times of arrivals. Most models for point processes beyond the Poisson process are most easily described in terms of times of arrivals or times between arrivals, and it is often hard to obtain results concerning the joint distribution of counts in different fixed time intervals. We feel that the DARMA models will be of use in such situations. There is also the possibility of modelling directly the binary counting process in a discrete time point process. This is discussed in Section 6.


2. SOME PRELIMINARY PROPERTIES OF THE DARMA(1,N+1) PROCESS


In this section we will give some properties of the DARMA(1,N+1) process. Unless otherwise indicated we will assume throughout the paper that $A_{-(N+1)}$ has distribution $\pi$ and is independent of $\{Y_n\}$, $\{U_n\}$, $\{V_n\}$ and $\{S_n\}$.

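2.1. The marginal distribution of $X_n$.

We will first show that $X_n$ as defined by (1.2) has distribution $\pi$ for all $n$. To this end we note from the expression (1.3) that the random variable $A_n$ can be expanded backwards to the initial value $A_{-(N+1)}$ to give $A_n = Y_{n-j}$ with probability $\rho^j(1-\rho)$ for $0 \le j \le N+n$ and $A_n = A_{-(N+1)}$ with probability $\rho^{N+n+1}$; that is, $A_n$ is a mixture of $Y_n, \ldots, Y_{-N}$ and $A_{-(N+1)}$. Hence, for $i$ in the state space $E$,

$$P\{A_n = i\} = \sum_{j=0}^{N+n} \rho^j(1-\rho)\,\pi(i) + \rho^{N+n+1}\,\pi(i) = \pi(i)$$

for $n = -N, -N+1, \ldots$. Similarly,

$$P\{X_n = i\} = \beta\,P\{Y_{n-S_n} = i\} + (1-\beta)\,P\{A_{n-(N+1)} = i\} = \pi(i)$$

for $i \in E$ and $n = 1, 2, \ldots$. Hence, the marginal distribution of the $X_n$'s, like that of the $Y_n$'s, is $\pi$.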


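2.2. Correlational properties of $\{X_n\}$

Although the $X_n$'s have a stationary distribution $\pi$, the $X_n$'s are not independent, as are the $Y_n$'s. This can be seen by the following calculation of the covariance between $X_n$ and $X_{n+j}$. After some simplification

$$\begin{aligned}
E[X_n X_{n+j}] - E[X_n]\,E[X_{n+j}]
&= \beta^2 \{E[Y_{n-S_n} Y_{n+j-S_{n+j}}] - E[Y_{n-S_n}]\,E[Y_{n+j-S_{n+j}}]\} \\
&\quad + \beta(1-\beta)\{E[Y_{n-S_n} A_{n+j-(N+1)}] - E[Y_{n-S_n}]\,E[A_{n+j-(N+1)}]\} \\
&\quad + \beta(1-\beta)\{E[Y_{n+j-S_{n+j}} A_{n-(N+1)}] - E[Y_{n+j-S_{n+j}}]\,E[A_{n-(N+1)}]\} \\
&\quad + (1-\beta)^2\{E[A_{n+j-(N+1)} A_{n-(N+1)}] - E[A_{n+j-(N+1)}]\,E[A_{n-(N+1)}]\}.
\end{aligned} \tag{2.1}$$

The covariance of $Y_{n-S_n}$ and $Y_{n+1-S_{n+1}}$ is

$$E[Y_{n-S_n} Y_{n+1-S_{n+1}}] - E[Y_{n-S_n}]\,E[Y_{n+1-S_{n+1}}] = \sum_{k=0}^{N-1} P\{S_n = k\}\,P\{S_{n+1} = k+1\}\,\mathrm{Var}(Y_n), \tag{2.2}$$

since the two selected variables coincide only when $S_{n+1} = S_n + 1$.

In the case $\beta = 1$ (the DMA(N) model), only the first term in (2.1) is nonzero and we can get the correlation from the above result. Putting $F(k) = P\{S_n = k\}$ we have for the correlation

$$\rho_{MA}(1) = \mathrm{corr}(Y_{n-S_n}, Y_{n+1-S_{n+1}}) = \frac{E[Y_{n-S_n} Y_{n+1-S_{n+1}}] - E[Y_{n-S_n}]\,E[Y_{n+1-S_{n+1}}]}{\mathrm{Var}(Y_n)} = \sum_{k=0}^{N-1} F(k)\,F(k+1).$$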

By similar reasoning, for $j \le N$

$$\rho_{MA}(j) = \mathrm{corr}(Y_{n-S_n}, Y_{n+j-S_{n+j}}) = \sum_{k=0}^{N-j} F(k)\,F(k+j), \tag{2.3}$$

and for $j > N$, $\rho_{MA}(j) = 0$. Note that these expressions do not depend on $n$ and thus the DMA(N) process is second order covariance stationary.

We will now compute the covariance of $A_{n-1}$ and $A_n$, which appears in (2.1); this will incidentally give us the correlation structure of the DAR(1) process.

$$\begin{aligned}
E[A_{n-1} A_n] - E[A_{n-1}]\,E[A_n]
&= \rho\,\{E[A_{n-1}^2] - (E[A_{n-1}])^2\} + (1-\rho)\{E[A_{n-1} Y_n] - E[A_{n-1}]\,E[Y_n]\} \\
&= \rho\,\mathrm{Var}(A_{n-1}),
\end{aligned}$$

since the second term is zero because $Y_n$ and $A_{n-1}$ are independent. This is because $A_{n-1}$ is a function only of $Y_{n-1}, Y_{n-2}, \ldots$. By an induction argument we obtain for the correlation of $A_n$ and $A_{n+j}$

$$\rho_A(j) = \mathrm{corr}(A_n, A_{n+j}) = \rho^j \tag{2.4}$$

for $j \ge 1$. Because of the assumption that $A_{-(N+1)}$ has distribution $\pi$, (2.4) does not depend on $n$ and thus the autoregressive process is second-order covariance stationary.

To complete the result for the general DARMA(1,N+1) process, we compute the cross covariances between the sequences $\{Y_{n-S_n}\}$ and $\{A_n\}$. We obtain

$$E[Y_{n+d-S_{n+d}} A_{n-(N+1)}] - E[Y_{n+d-S_{n+d}}]\,E[A_{n-(N+1)}] = 0 \tag{2.5}$$

for $d \ge 1$, since $S_{n+d}$ only takes on the values $\{0, 1, \ldots, N\}$. For $0 \le d < N$

$$\begin{aligned}
E[Y_{n-S_n} A_{n-N+d}] - E[Y_{n-S_n}]\,E[A_{n-N+d}]
&= \sum_{k=0}^{N} \{E[Y_{n-k} A_{n-N+d};\ S_n = k] - E[Y_{n-k};\ S_n = k]\,E[A_{n-N+d}]\} \\
&= [(1-\rho) F(N-d) + \rho(1-\rho) F(N-d+1) + \cdots + \rho^{d}(1-\rho) F(N)]\,\mathrm{Var}(Y_n).
\end{aligned} \tag{2.6}$$

For $d \ge N$

$$E[Y_{n-S_n} A_{n-N+d}] - E[Y_{n-S_n}]\,E[A_{n-N+d}] = [\rho^{d-N}(1-\rho) F(0) + \rho^{d-N+1}(1-\rho) F(1) + \cdots + \rho^{d}(1-\rho) F(N)]\,\mathrm{Var}(Y_n). \tag{2.7}$$

Putting everything together in (2.1) we obtain the correlation of $X_n$ and $X_{n+j}$,

$$\rho_N(j) = (E[X_n X_{n+j}] - E[X_n]\,E[X_{n+j}]) / \mathrm{Var}(X_1).$$

We have

$$\rho_N(1) = \beta^2 \sum_{k=0}^{N-1} F(k)\,F(k+1) + \beta(1-\beta)\,F(N)(1-\rho) + (1-\beta)^2 \rho. \tag{2.8}$$

For $1 \le j \le N$

$$\rho_N(j) = \beta^2 \sum_{k=0}^{N-j} F(k)\,F(k+j) + \beta(1-\beta)(1-\rho)\{F(N-j+1) + \rho F(N-j+2) + \cdots + \rho^{j-1} F(N)\} + (1-\beta)^2 \rho^j. \tag{2.9}$$

For $j > N$

$$\rho_N(j) = \rho^{\,j-(N+1)}\{\beta(1-\beta)(1-\rho)[F(0) + \rho F(1) + \cdots + \rho^N F(N)] + (1-\beta)^2 \rho^{N+1}\}. \tag{2.10}$$

Note that $0 \le \rho_N(j) \le 1$ and, for $j > N$, $\rho_N(j)$ decreases geometrically if $\rho > 0$ and $\beta < 1$. Since $\rho_N(j)$ is independent of $n$, the DARMA(1,N+1) process is second order covariance stationary.
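The correlation formulas of this section are easy to evaluate numerically. The following sketch evaluates $\rho_N(j)$ from the three cases for lag 1, lags up to $N$ and lags beyond $N$; the function name `darma_corr` and the list representation of $F$ are assumptions of the example, and the lag-1 case is handled by the general formula for $j \le N$.

```python
def darma_corr(j, F, beta, rho):
    """Lag-j correlation of a DARMA(1, N+1) process.

    F is the list [F(0), ..., F(N)]; beta and rho are the mixing
    probabilities P{U_n = 1} and P{V_n = 1}.
    """
    N = len(F) - 1
    if j == 0:
        return 1.0
    if j <= N:
        # Moving-average part: both X's select the same Y.
        ma = sum(F[k] * F[k + j] for k in range(N - j + 1))
        # Cross term: Y_{n-S_n} coincides with the Y selected by A.
        cross = (1 - rho) * sum(rho**m * F[N - j + 1 + m] for m in range(j))
        return (beta**2 * ma
                + beta * (1 - beta) * cross
                + (1 - beta)**2 * rho**j)
    # j > N: geometric decay at rate rho.
    tail = sum(rho**k * F[k] for k in range(N + 1))
    return rho**(j - N - 1) * (beta * (1 - beta) * (1 - rho) * tail
                               + (1 - beta)**2 * rho**(N + 1))
```

With `beta = 0.0` the function returns `rho**j`, the DAR(1) correlation, and with `beta = 1.0` it returns the DMA(N) correlation, which vanishes beyond lag $N$.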


2.3. Invariance under transformations

From its definition we note that $X_n$ is a mixture of the random variables $Y_n, \ldots, Y_{-N}$ and $A_{-(N+1)}$; i.e., it is a random selection of one and only one of these random variables. Thus, if we transform each of the random variables $Y_n, \ldots, Y_{-N}, A_{-(N+1)}$ by the same function, each $X_n$ will be transformed in the same way and its distribution will be that of the transformed $Y_n$'s.

Similar remarks apply if we transform the $X_n$'s. Note that in applying a common transformation individually to the $X_n$'s we do not affect the selection procedure, and therefore the correlation structure of the transformed process is the same as that of the untransformed process. This (marginal) transformation invariance is important for statistical analysis of the process.


3. THE AUTOREGRESSIVE PROCESS DAR(1)


In this section we will give some properties of the DAR(1) process $\{A_n\}$. As usual we will assume that the initial value has distribution $\pi$. By the results of Section 2, $\{A_n\}$ is a stationary sequence of random variables with marginal distribution $\pi$ and correlations

$$\rho_A(j) = \rho^j, \qquad j \ge 1. \tag{3.1}$$

The spectrum of the process is thus

$$f(\omega) = \frac{1}{2\pi}\Big\{1 + 2\sum_{j=1}^{\infty} \rho_A(j)\cos(\omega j)\Big\} = \frac{1}{2\pi}\,\frac{1-\rho^2}{1+\rho^2-2\rho\cos\omega}. \tag{3.2}$$


It follows from (1.3) that $\{A_n\}$ is a Markov chain; that is, $P\{A_{n+1} = i \mid A_1, \ldots, A_n\} = P\{A_{n+1} = i \mid A_n\}$ for any $i$ in the state space $E$. Further, it is not hard to show that the transition matrix $P$ is given by

$$P\{A_{n+1} = i \mid A_n = k\} = P(k,i) = \begin{cases} (1-\rho)\,\pi(i) & \text{for } k \ne i, \\ \rho + (1-\rho)\,\pi(i) & \text{for } k = i. \end{cases} \tag{3.3}$$

Note that we have started from the opposite direction from that usually taken in Markov chain theory; we have specified the stationary distribution associated with the chain first and specified the (Markovian) dependency structure by a single parameter $\rho$. Moreover, changing $\rho$ does not affect $\pi$. When $\rho = 0$ we have a stationary sequence of independent identically distributed random variables with distribution $\pi$.

The fact that $\{A_n\}$ is a Markov chain with a particularly simple transition function $P$ makes many calculations quite easy. For example, in discrete time series, runs of given values of the random variables are useful in statistical analyses. Properties of these runs are easy to obtain for the DAR(1) process. Thus fix a state $i \in E$ and let $T_i = \inf\{n \ge 1 : A_n \ne i\} - 1$; $T_i$ is the length of a run of $i$'s starting at time 1, where the length can be $0, 1, \ldots$. Then

$$P\{T_i \ge n\} = P\{A_1 = A_2 = \cdots = A_n = i\} = \pi(i)\,P(i,i)^{n-1}$$

for $n \ge 1$ and $P\{T_i = 0\} = 1 - \pi(i)$. Thus

$$E[T_i] = \frac{\pi(i)}{1 - P(i,i)} = \frac{\pi(i)}{(1-\rho)[1-\pi(i)]}. \tag{3.4}$$

If $\rho = 0$, then $A_n = Y_n$ for $n \ge 1$ and $E[T_i] = \pi(i)/[1-\pi(i)]$, as expected since $\{A_n\}$ is a sequence of independent random variables in this case. Note that for $0 \le \rho < 1$

$$E[T_i] \ge \frac{\pi(i)}{1-\pi(i)};$$

that is, the expected length of a run of $i$'s for a DAR(1) process is always greater than or equal to the expected run length for a sequence of independent random variables. Moreover the inflation in the expected length of runs, by the factor $1/(1-\rho)$, is uniform for all states. This is a consequence of the fact that we are dealing with a one parameter Markov chain.
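The transition matrix and the expected run length of the DAR(1) process can be checked with a short sketch. The names `dar1_transition` and `dar1_expected_run` and the nested-dictionary layout of the matrix are assumptions of the example.

```python
def dar1_transition(pi, rho):
    """Transition probabilities P(k, i) of the DAR(1) chain.

    pi is a dict state -> probability; the result is a nested dict
    with P[k][i] = (1 - rho) * pi[i], plus rho on the diagonal.
    """
    states = list(pi)
    return {k: {i: (1 - rho) * pi[i] + (rho if i == k else 0.0)
                for i in states}
            for k in states}


def dar1_expected_run(pi_i, rho):
    """Expected length E[T_i] of a run of state i starting at time 1."""
    return pi_i / ((1 - rho) * (1 - pi_i))
```

Each row of the matrix sums to one, and $\pi$ is left invariant by it for every $\rho$, which is the sense in which the marginal distribution and the dependence parameter are specified separately.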

It is also not hard to calculate the generating function for $T_i$. We have for $0 \le z \le 1$

$$\Phi(z) = \sum_{n=0}^{\infty} z^n P\{T_i = n\} = [1-\pi(i)] + \sum_{n=1}^{\infty} z^n\,\pi(i)\,[P(i,i)]^{n-1}[1 - P(i,i)] = [1-\pi(i)] + \frac{z\,\pi(i)\,[1 - P(i,i)]}{1 - z\,P(i,i)}.$$

Again, if $\rho = 0$, $\Phi(z)$ reduces to $[1-\pi(i)]/[1 - z\pi(i)]$, the expression for the generating function of the length of a run of $i$'s for a sequence of independent random variables with marginal distribution $\pi$.


4. THE DMA PROCESS


In this section we will consider the DMA(N) process $X_n = Y_{n-S_n}$. Note that, unlike the DAR(1) process, the DMA(N) process is not Markovian in general.


4.1. Correlation properties.

By the results in Section 2, $\{X_n\}$ is a stationary sequence of random variables with marginal distribution $\pi$ and correlations

$$\rho_{MA}(j) = \mathrm{corr}(X_n, X_{n+j}) = \sum_{k=0}^{N-j} F(k)\,F(k+j) = \sum_{v=j}^{N} F(v)\,F(v-j) \tag{4.1}$$

for $1 \le j \le N$. Also $\rho_{MA}(j) = 0$ for $j > N$ and $\rho_{MA}(0) = 1$. Note that when $N = 1$, the maximum value of the first order serial correlation is

$$\max_{F(0)} \rho_{MA}(1) = \max_{F(0)} \{F(0)[1 - F(0)]\} = 1/4.$$

In fact one can show that for any $N \ge 1$ the maximum first order serial correlation that can be achieved is 1/4. One can also maximize the correlation at any lag, say $j$, by making $F(j) = F(0) = 1/2$. However, all the other correlations are then zero.

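For the spectrum of the DMA(N) process we have

$$f(\omega) = \frac{1}{2\pi}\Big\{1 + 2\sum_{j=1}^{\infty} \rho_{MA}(j)\cos(\omega j)\Big\}; \tag{4.2}$$

then if we define $\rho_{MA}(-j) = \rho_{MA}(j)$, we have

$$\begin{aligned}
f(\omega) &= \frac{1}{2\pi}\Big[\sum_{j=-\infty}^{+\infty} e^{i\omega j} \sum_{v=|j|}^{N} F(v)\,F(v-|j|) + 1 - \sum_{j=0}^{N} F(j)^2\Big] \\
&= \frac{1}{2\pi}\Big[\Big(\sum_{j=0}^{N} e^{i\omega j} F(j)\Big)\Big(\sum_{j=0}^{N} e^{-i\omega j} F(j)\Big) + 1 - \sum_{j=0}^{N} F(j)^2\Big] \\
&= \frac{1}{2\pi}\Big[\varphi_S(\omega)\,\varphi_S(-\omega) + 1 - \sum_{j=0}^{N} F(j)^2\Big] \\
&= \frac{1}{2\pi}\Big[|\varphi_S(\omega)|^2 + 1 - \sum_{j=0}^{N} F(j)^2\Big],
\end{aligned} \tag{4.3}$$

where $\varphi_S$ is the characteristic function of the distribution $F$ of the random variable $S$; the term $1 - \sum_j F(j)^2$ accounts for $\rho_{MA}(0) = 1$. Thus we can model a broad class of spectra $f(\omega)$. If $F(0) = 1$ we have an independent identically distributed sequence and a flat (constant) spectrum.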
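The two forms of the DMA(N) spectrum, the cosine series and the characteristic-function form, can be compared numerically. The sketch below is illustrative only; the function names `dma_spectrum` and `dma_spectrum_direct` are assumptions of the example.

```python
import cmath
import math


def dma_spectrum(omega, F):
    """Spectrum via the characteristic function of S.

    f(omega) = (|phi_S(omega)|^2 + 1 - sum_j F(j)^2) / (2 pi).
    """
    phi = sum(f * cmath.exp(1j * omega * k) for k, f in enumerate(F))
    return (abs(phi) ** 2 + 1 - sum(f * f for f in F)) / (2 * math.pi)


def dma_spectrum_direct(omega, F):
    """Spectrum via the cosine series in the correlations rho_MA(j)."""
    N = len(F) - 1

    def rho_ma(j):
        return sum(F[k] * F[k + j] for k in range(N - j + 1))

    series = sum(rho_ma(j) * math.cos(omega * j) for j in range(1, N + 1))
    return (1 + 2 * series) / (2 * math.pi)
```

The two functions agree up to rounding for any distribution `F` on $\{0, \ldots, N\}$, and both return $1/2\pi$ when `F = [1.0]`.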

By way of example, it is worth noting that we have restricted $S$ to have finite support. Then (4.3) is a polynomial in $\cos\omega$, just as for any moving average process. The finite support was necessary to allow inclusion of the autoregressive tail (1.2). If one does not want to add this tail, then there is no reason to restrict the range of $S$. One then gets a much broader class of models for which (4.3) in particular holds, although the model is still a random index model. This extended model is not as broad as the DARMA(1,N+1) model, in the sense that one cannot, as in linear models (see, for example, Anderson, 1970), represent the tail (1.3) as a random index model in which the random indices for each $n$ are independent random variables.

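To continue with the example, in the extended moving average model let $S$ have a geometric distribution

$$P\{S = j\} = F(j) = p(1-p)^j, \qquad j = 0, 1, \ldots.$$

Then

$$\begin{aligned}
f(\omega) &= \frac{1}{2\pi}\Big\{\frac{p^2}{[1 - (1-p)e^{i\omega}][1 - (1-p)e^{-i\omega}]} + 1 - \sum_{j=0}^{\infty} p^2(1-p)^{2j}\Big\} \\
&= \frac{1}{2\pi}\Big\{\frac{p^2}{1 + (1-p)^2 - 2(1-p)\cos\omega} + \frac{2p(1-p)}{1 - (1-p)^2}\Big\}.
\end{aligned}$$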

The initial point in the spectrum is related to the amount of long term dependence there is in the process. One could measure this by an index of dispersion (Cox and Lewis, 1966, p. 71)

$$J_k = \frac{\mathrm{Var}(X_1 + \cdots + X_k)}{k\{E[X]\}^2},$$

for which

$$\lim_{k\to\infty} J_k = 2\pi\,\frac{\mathrm{Var}(X)}{\{E[X]\}^2}\,f(0+).$$

For the moving average process, $f(0+) = \frac{1}{2\pi}\big[2 - \sum_{k=0}^{N} F(k)^2\big]$, which takes values between $1/2\pi$ and $1/\pi$. To compare the moving average process to the DAR(1) we note that from (3.2) for the DAR(1) process $f(0) = \frac{1}{2\pi}[(1-\rho^2)/(1-\rho)^2] = \frac{1}{2\pi}[(1+\rho)/(1-\rho)]$, which is always greater than $1/2\pi$ if $\rho > 0$ and increases with $\rho$ to infinity. Note that $f(0+)$ for a sequence of independent random variables is $1/2\pi$. Thus both the moving average process and the DAR process give more long term dependence than a sequence of independent identically distributed random variables. The DAR(1) process allows more long term dependence than the moving average process.


4.2. Joint distributions and time reversibility

Unless otherwise indicated we will restrict our attention to the DMA(1) process in the remainder of this section; that is, if $\alpha = P(S_n = 0)$, then

$$X_n = \begin{cases} Y_n & \text{with probability } \alpha, \\ Y_{n-1} & \text{with probability } (1-\alpha). \end{cases} \tag{4.8}$$

In this case $\rho_{MA}(1) = \alpha(1-\alpha)$ and $\rho_{MA}(j) = 0$ for $j \ge 2$.

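It is not hard to calculate the joint Laplace-Stieltjes transforms of the joint distributions of random variables in the DMA(1) sequence, but it is tedious. For example, if $\gamma(s) = E[e^{-sX_1}]$, then from (4.8)

$$\begin{aligned}
\psi_2(s_1, s_2) &= E[\exp\{-s_1X_1 - s_2X_2\}] \\
&= (1-\alpha)\,\gamma(s_1)\gamma(s_2) + \alpha(1-\alpha)\,\gamma(s_1+s_2) + \alpha^2\gamma(s_1)\gamma(s_2) \\
&= \gamma(s_1)\gamma(s_2)[1 - \alpha(1-\alpha)] + \alpha(1-\alpha)\,\gamma(s_1+s_2).
\end{aligned}$$

Similarly, by conditioning arguments we obtain

$$\begin{aligned}
\psi_3(s_1, s_2, s_3) &= E[\exp\{-s_1X_1 - s_2X_2 - s_3X_3\}] \\
&= (1-\alpha)\,\gamma(s_1)\,\psi_2(s_2, s_3) + \alpha(1-\alpha)\,\gamma(s_1+s_2)\gamma(s_3) \\
&\quad + \alpha^2(1-\alpha)\,\gamma(s_1)\gamma(s_2+s_3) + \alpha^3\gamma(s_1)\gamma(s_2)\gamma(s_3) \\
&= \gamma(s_1)\gamma(s_2)\gamma(s_3)[1 - 2\alpha(1-\alpha)] \\
&\quad + [\gamma(s_1)\gamma(s_2+s_3) + \gamma(s_1+s_2)\gamma(s_3)]\,\alpha(1-\alpha)
\end{aligned}$$

and

$$\begin{aligned}
\psi_4(s_1, s_2, s_3, s_4) &= E[\exp\{-s_1X_1 - s_2X_2 - s_3X_3 - s_4X_4\}] \\
&= (1-\alpha)\,\gamma(s_1)\,\psi_3(s_2, s_3, s_4) + \alpha(1-\alpha)\,\gamma(s_1+s_2)\,\psi_2(s_3, s_4) \\
&\quad + \alpha^2(1-\alpha)\,\gamma(s_1)\gamma(s_2+s_3)\gamma(s_4) + \alpha^3(1-\alpha)\,\gamma(s_1)\gamma(s_2)\gamma(s_3+s_4) \\
&\quad + \alpha^4\gamma(s_1)\gamma(s_2)\gamma(s_3)\gamma(s_4) \\
&= \gamma(s_1)\gamma(s_2)\gamma(s_3)\gamma(s_4)[1 - 3\alpha(1-\alpha) + \alpha^2(1-\alpha) - \alpha^3(1-\alpha)] \\
&\quad + \gamma(s_1)\gamma(s_2)\gamma(s_3+s_4)[\alpha(1-\alpha) - \alpha^2(1-\alpha) + \alpha^3(1-\alpha)] \\
&\quad + \gamma(s_1)\gamma(s_2+s_3)\gamma(s_4)\,\alpha(1-\alpha) \\
&\quad + \gamma(s_1+s_2)\gamma(s_3)\gamma(s_4)[\alpha(1-\alpha) - \alpha^2(1-\alpha) + \alpha^3(1-\alpha)] \\
&\quad + \gamma(s_1+s_2)\gamma(s_3+s_4)\,\alpha^2(1-\alpha)^2.
\end{aligned}$$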

One interest in the joint distributions of the random variables is to look at the time reversibility of the process. One reason for concern with time reversibility is the following. The EMA(1) process (exponential moving average of order 1) of Lawrance and Lewis (1977) is not time reversible, even though this fact cannot be determined from second-order properties of the process. Consequently one cannot distinguish between $\alpha$ and $(1-\alpha)$ in the spectrum of the EMA(1) process. However, by using higher order moments it is possible to distinguish between $\alpha$ and $(1-\alpha)$. For the DMA(1) process the fact that $\rho_{MA}(1) = \alpha(1-\alpha)$ means we cannot use it to distinguish between $\alpha$ and $(1-\alpha)$. Time reversibility of the DMA(1) process would mean that we might not be able to distinguish between $\alpha$ and $(1-\alpha)$ even by using higher order moments.

Since $\psi_n(s_1, s_2, \ldots, s_n) = \psi_n(s_n, s_{n-1}, \ldots, s_1)$ for $n = 2, 3, 4$, it seems likely that the DMA(1) process is time reversible. In order to show time reversibility we need to show that $\psi_n(s_1, s_2, \ldots, s_n) = \psi_n(s_n, s_{n-1}, \ldots, s_1)$ for all $n$ and nonnegative $s_1, \ldots, s_n$. For simplicity consider the terms $a = \gamma(s_1)\,\gamma(s_2+s_3)\,\gamma(s_4)\cdots\gamma(s_n)$ and $b = \gamma(s_1)\,\gamma(s_2)\cdots\gamma(s_{n-3})\,\gamma(s_{n-2}+s_{n-1})\,\gamma(s_n)$ of $\psi_n(s_1, \ldots, s_n)$. In the expression for $\psi_n(s_1, s_2, \ldots, s_n)$ the term $a$ has a coefficient of

$$\begin{aligned}
&\sum_{j=3}^{n} P\{S_1 = 0 \text{ or } 1,\ S_2 = 0,\ S_3 = 1, \ldots, S_j = 1,\ S_{j+1} = 0, \ldots, S_n = 0\} \\
&\quad = \sum_{j=3}^{n} P\{S_1 = 0,\ S_2 = 0,\ S_3 = 1,\ S_4 = 1, \ldots, S_j = 1,\ S_{j+1} = 0, \ldots, S_n = 0\} \\
&\qquad + \sum_{j=3}^{n} P\{S_1 = 1,\ S_2 = 0,\ S_3 = 1,\ S_4 = 1, \ldots, S_j = 1,\ S_{j+1} = 0, \ldots, S_n = 0\},
\end{aligned}$$

since the event associated with the term $a$ is that $X_2$ and $X_3$ pick the same $Y_n$ and that all the other $X_n$ pick $Y_n$'s distinct from each other. Similarly, the term $b$ has a coefficient of

$$\sum_{j=0}^{n-3} P\{S_1 = 1, \ldots, S_j = 1,\ S_{j+1} = 0, \ldots, S_{n-3} = 0,\ S_{n-2} = 0,\ S_{n-1} = 1,\ S_n = 0 \text{ or } 1\}.$$

Since the $S_n$ are independent and only take the values 0 and 1 and the $Y_n$ are independent, the coefficients of the terms $a$ and $b$ of $\psi_n(s_1, s_2, \ldots, s_n)$ are equal.

Similar arguments can be used to show that the coefficients of the terms of $\psi_n(s_1, \ldots, s_n)$ that are mirror images of each other are equal, for any sequences of integers

$$1 \le j_1 < j_1 + 1 < j_2 < j_2 + 1 < \cdots < j_k < j_k + 1 \le n$$

marking the positions of the paired arguments, and $k \ge 1$. Thus $\psi_n(s_1, \ldots, s_n) = \psi_n(s_n, \ldots, s_1)$ for all $n$. Hence the DMA(1) process is time reversible.
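For a binary marginal distribution the time reversibility of the DMA(1) process can be checked by brute force for small $n$. The sketch below enumerates every outcome of $S_1, \ldots, S_n$ and $Y_0, \ldots, Y_n$ with exact rational arithmetic; the function name `dma1_joint` and the restriction to binary $Y_n$ are assumptions of the example, and the check illustrates the result rather than proving it.

```python
from fractions import Fraction
from itertools import product


def dma1_joint(n, alpha, pi1):
    """Exact joint pmf of (X_1, ..., X_n) for a binary DMA(1) process.

    alpha = P{S_k = 0}, pi1 = P{Y_k = 1}.  X_k = Y_{k - S_k}, so the
    outcome is determined by S_1..S_n and Y_0..Y_n.
    """
    pmf = {}
    for s in product((0, 1), repeat=n):            # S_1, ..., S_n
        ps = 1
        for sk in s:
            ps *= alpha if sk == 0 else 1 - alpha
        for y in product((0, 1), repeat=n + 1):    # Y_0, ..., Y_n
            py = 1
            for yk in y:
                py *= pi1 if yk == 1 else 1 - pi1
            x = tuple(y[k - s[k - 1]] for k in range(1, n + 1))
            pmf[x] = pmf.get(x, 0) + ps * py
    return pmf


if __name__ == "__main__":
    pmf = dma1_joint(4, Fraction(1, 3), Fraction(1, 5))
    # Every outcome has the same probability as its reversal.
    print(all(p == pmf[x[::-1]] for x, p in pmf.items()))
```

Using `Fraction` arguments keeps the comparison exact, so equality of the forward and reversed probabilities is not blurred by rounding.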


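4.3. Run lengths for the DMA(1) process.

We will now consider the length of runs for a DMA(1) process. Fix a state $i$ in $E$ and let $T_i = \inf\{n \ge 1 : X_n \ne i\} - 1$ be the length of a run of $i$'s initiated at time 1, where the length can be $0, 1, \ldots$. We will first compute $E[T_i]$. Let $a_0 = 1$ and $a_n = P\{X_1 = X_2 = \cdots = X_n = i\}$ for $n \ge 1$. Then

$$a_1 = P\{X_1 = i\} = \pi(i),$$

$$a_2 = P\{X_1 = X_2 = i\} = (1-\alpha)\,\pi(i)\,a_1 + \alpha(1-\alpha)\,\pi(i)\,a_0 + \alpha^2\pi(i)^2,$$

and by induction

$$a_n = (1-\alpha)\,\pi(i)\,a_{n-1} + \sum_{k=1}^{n-1} \alpha^k(1-\alpha)\,\pi(i)^k\,a_{n-1-k} + \alpha^n\pi(i)^n.$$

Summing over $n \ge 1$ gives

$$E[T_i] = \frac{p(i)}{1 - p(i)} = \frac{\pi(i)\,[1 + \alpha(1-\alpha)(1 - \pi(i))]}{[1 - \pi(i)]\,[1 - \alpha(1-\alpha)\,\pi(i)]}, \tag{4.9}$$

where

$$p(i) = \pi(i)[1 + \alpha(1-\alpha)] - \pi(i)^2\,\alpha(1-\alpha) = \pi(i) + \alpha(1-\alpha)\,\pi(i)[1 - \pi(i)]. \tag{4.10}$$

If $\alpha$ is either 0 or 1, then $\{X_n\}$ is a sequence of independent random variables and $E[T_i] = \pi(i)/(1-\pi(i))$, as expected. Note that $E[T_i] > \pi(i)/[1-\pi(i)]$ for $0 < \alpha < 1$; that is, the expected length of a run of $i$'s for a DMA(1) process is greater than the expected length of a run for a sequence of independent random variables. For a given distribution $\pi$, the maximum value of $E[T_i]$ occurs when $\alpha = 1/2$. In this case

$$E[T_i] = \frac{\pi(i)\,[1 + \tfrac14(1 - \pi(i))]}{[1 - \pi(i)]\,[1 - \tfrac14\pi(i)]}.$$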


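We now turn our attention to the generating function of $T_i$. Fix $j \ne i$ in the state space $E$ and let

$$b_n = P\{X_1 = \cdots = X_n = i,\ X_{n+1} = j\}, \qquad n \ge 1,$$

and

$$b_0 = P\{X_1 = j\}.$$

Using an induction argument we obtain for $n \ge 1$

$$b_n = (1-\alpha)\,\pi(i)\,b_{n-1} + \alpha\pi(i)(1-\alpha)\,b_{n-2} + \alpha^2\pi(i)^2(1-\alpha)\,b_{n-3} + \cdots + \alpha^{n-1}\pi(i)^{n-1}(1-\alpha)\,b_0 + \alpha^n\pi(i)^n\,\alpha\pi(j).$$

Thus

$$\sum_{n=0}^{\infty} z^n b_n = b_0 + \sum_{n=0}^{\infty} z^n b_n \Big\{z(1-\alpha)\,\pi(i) + (1-\alpha)\,z\sum_{n=1}^{\infty} \alpha^n\pi(i)^n z^n\Big\} + \alpha\pi(j)\sum_{n=1}^{\infty} \alpha^n\pi(i)^n z^n.$$

After some simplification, and summing over $j \ne i$, we obtain

$$\Phi(z) = \sum_{n=0}^{\infty} z^n P\{T_i = n\} = \frac{[1-\pi(i)]\,[1 - z\,\alpha(1-\alpha)\,\pi(i)]}{1 - z\pi(i) - z^2\pi(i)[1-\pi(i)]\,\alpha(1-\alpha)}. \tag{4.11}$$

Note that for $\alpha = 0$ or 1, $\Phi(z) = [1-\pi(i)]/[1 - z\pi(i)]$, as expected. Higher order moments of the run lengths can be obtained from (4.11).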

5. THE BINARY DARMA(1,1) PROCESS


In this section we will consider a DARMA(1,1) process in which $X_n$ takes only the values 0 and 1; that is,

$$X_n = \begin{cases} Y_n & \text{with probability } \beta, \\ A_{n-1} & \text{with probability } (1-\beta), \end{cases} \tag{5.1}$$

and

$$A_n = \begin{cases} A_{n-1} & \text{with probability } \rho, \\ Y_n & \text{with probability } (1-\rho), \end{cases}$$

where $\{Y_n\}$ is a sequence of independent random variables taking the values 0 and 1 with common distribution $\pi$. Note that the DARMA(1,1) process is not Markovian in general.

Time series of binary random variables are of particular importance for modelling the differential counting process in discrete time point processes. Klotz (1973) and Kanter (1975) have given a model which is different from the binary DARMA(1,1) process.

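Setting $N = 0$ in (2.8)-(2.10) we obtain the correlation of $X_n$ and $X_{n+j}$, $j = 1, 2, \ldots$,

$$\rho_1(j) = \mathrm{corr}(X_n, X_{n+j}) = \rho^{\,j-1}[\beta(1-\beta)(1-\rho) + (1-\beta)^2\rho].$$

The spectrum of the process is thus

$$f(\omega) = \frac{1}{2\pi}\Big\{1 + 2\sum_{j=1}^{\infty}\rho_1(j)\cos(\omega j)\Big\} = \frac{1}{2\pi}\Big[\frac{2c(\beta,\rho)[\cos\omega - \rho] - 2\rho\cos\omega + \rho^2 + 1}{1 - 2\rho\cos\omega + \rho^2}\Big], \tag{5.2}$$

where

$$c(\beta,\rho) = \beta(1-\beta)(1-\rho) + (1-\beta)^2\rho.$$

We will now consider some properties of lengths of runs. For fixed $i \in \{0,1\}$, let $T_i = \inf\{n \ge 1 : X_n \ne i\} - 1$ as before. We will calculate $E[T_i]$ and the generating function for $T_i$. To begin, note that although $\{X_n\}$ is not a Markov chain, $\{(A_n, X_n);\ n = 1, 2, \ldots\}$ is a Markov chain. For $i, l, j, k$ in $\{0,1\}$

$$P\{A_{n+1} = j,\ X_{n+1} = k \mid A_n = i,\ X_n = l\} = Q_k(i,j),$$

independent of $l$. Letting $Q_k$ denote the matrix whose $(i,j)$ entry is $Q_k(i,j)$ we have

$$Q_0 = \begin{pmatrix} \rho(1-\beta) + [1 - \rho(1-\beta)]\,\pi(0) & (1-\beta)(1-\rho)\,\pi(1) \\ \beta(1-\rho)\,\pi(0) & \beta\rho\,\pi(0) \end{pmatrix}.$$

Note that $P\{T_0 \ge n \mid A_0 = i\} = Q_0^n(i,0) + Q_0^n(i,1) = Q_0^n(i,E)$ for $i = 0, 1$. Hence

$$E[T_0 \mid A_0 = i] = \sum_{n=1}^{\infty} Q_0^n(i,E) = R_0(i,E) - 1,$$

where $R_0(i,j) = \sum_{n=0}^{\infty} Q_0^n(i,j)$, with $Q_0^0 = I$ being the identity matrix, and $R_0(i,E) = R_0(i,0) + R_0(i,1)$. It is not hard to show that

$$R_0 = (I - Q_0)^{-1} = \frac{1}{\Delta}\begin{pmatrix} 1 - \beta\rho\,\pi(0) & (1-\beta)(1-\rho)\,\pi(1) \\ \beta(1-\rho)\,\pi(0) & 1 - \rho(1-\beta) - [1 - \rho(1-\beta)]\,\pi(0) \end{pmatrix},$$

where

$$\Delta = \det(I - Q_0) = [1-\pi(0)]\{1 - \rho(1-\beta)[1 - \beta\pi(0)] - \beta\pi(0)[1 - \beta(1-\rho)]\}.$$

Thus, writing $\rho_1(1) = \mathrm{corr}(X_n, X_{n+1})$,

$$\begin{aligned}
E[T_0] &= \pi(0)\,R_0(0,E) + \pi(1)\,R_0(1,E) - 1 \\
&= \frac{\pi(0)\{1 - \beta\rho + \beta(1-\rho-\beta+2\beta\rho)[1 - \pi(0)]\}}{[1-\pi(0)]\{1 - \rho(1-\beta)[1 - \beta\pi(0)] - \beta\pi(0)[1 - \beta(1-\rho)]\}} \\
&= \frac{\pi(0)\{1 + \rho_1(1) + \beta\rho - \rho\} + \pi(0)^2\{-\rho_1(1) - 2\beta\rho + \rho\}}{1 - \rho(1-\beta) + \pi(0)\{-1 + 2\rho - 3\beta\rho - \rho_1(1)\} + \pi(0)^2\{\rho_1(1) + 2\beta\rho - \rho\}} \\
&= \frac{\pi(0)\{\rho_1(1) + 1 - \rho + \beta\rho - \pi(0)(\rho_1(1) + 2\beta\rho - \rho)\}}{[1 - \pi(0)]\{1 - \rho + \beta\rho - \pi(0)(\rho_1(1) + 2\beta\rho - \rho)\}}
\end{aligned} \tag{5.3}$$

after some simplification.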
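The matrix calculation of $E[T_0]$ can be cross-checked against the closed form in terms of $\beta$, $\rho$ and $\pi(0)$. The sketch below inverts the $2 \times 2$ matrix $I - Q_0$ by hand; the function names are assumptions of the example, and the second row of `q0_matrix` follows from the definitions of $X_n$ and $A_n$ in this section.

```python
def q0_matrix(beta, rho, pi0):
    """Q_0(i, j) = P{A_{n+1} = j, X_{n+1} = 0 | A_n = i} for i, j in {0, 1}."""
    pi1 = 1 - pi0
    return [
        [rho * (1 - beta) + (1 - rho * (1 - beta)) * pi0,
         (1 - beta) * (1 - rho) * pi1],
        [beta * (1 - rho) * pi0,
         beta * rho * pi0],
    ]


def expected_run_zeros_matrix(beta, rho, pi0):
    """E[T_0] = pi(0) R_0(0,E) + pi(1) R_0(1,E) - 1 with R_0 = (I - Q_0)^{-1}."""
    (q00, q01), (q10, q11) = q0_matrix(beta, rho, pi0)
    det = (1 - q00) * (1 - q11) - q01 * q10
    r0 = ((1 - q11) + q01) / det      # row sum R_0(0, E)
    r1 = (q10 + (1 - q00)) / det      # row sum R_0(1, E)
    return pi0 * r0 + (1 - pi0) * r1 - 1


def expected_run_zeros_closed(beta, rho, pi0):
    """Closed form for E[T_0] in terms of beta, rho and pi(0)."""
    num = pi0 * (1 - beta * rho
                 + beta * (1 - rho - beta + 2 * beta * rho) * (1 - pi0))
    den = (1 - pi0) * (1 - rho * (1 - beta) * (1 - beta * pi0)
                       - beta * pi0 * (1 - beta * (1 - rho)))
    return num / den
```

With `beta = 0.0` both functions reduce to the DAR(1) value $\pi(0)/\{(1-\rho)[1-\pi(0)]\}$, which gives a quick sanity check on the parametrisation.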

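Similarly one can show that the generating function of $T_0$ is a ratio of polynomials in $z$ whose denominator is

$$a(z) = \det(I - zQ_0) = 1 - z\rho(1-\beta) + z\pi(0)\{-\beta\rho - 1 + \rho(1-\beta)\} + z^2\pi(0)[1-\pi(0)]\{-\beta(1-\beta) + 2\rho\beta(1-\beta)\} + z^2\pi(0)^2\beta\rho.$$

In a similar manner one can show that

$$E[T_1] = \text{the right-hand side of (5.3) with } \pi(0) \text{ replaced by } \pi(1). \tag{5.4}$$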


6. TELEPHONE ERROR DATA


We discuss here very briefly a case in which the binary DARMA(1,1) model may be of use; in particular, we do this to illustrate some of the formulas given in the previous section. Another recent example of the need for discrete time-series models is Gaver, Lavenberg and Price (1976).

Cox and Lewis (1966, p. 173) discussed data consisting of errors which occurred during the transmission of binary data over a telephone line. Let $X_n = 1$ indicate that the $n$th transmitted bit is in error, while $X_n = 0$ indicates that no error occurred. Models which postulate that bit errors occur independently or according to a Markov chain, such as the binary DAR(1) process, predict that the runs of ones and runs of zeros will both have geometric distributions. But the runs of zeros are just the intervals (or number of bits) between errors, excluding the intervals of 1 bit between errors, and these were shown by Berger and Mandelbrot (1963) and Lewis and Cox (1966) to be nongeometric. In fact they are highly skewed, long-tailed distributions, which led Berger and Mandelbrot (1963) to postulate a model in which intervals between errors were assumed to be independent with Pareto-type distributions.

The problem with the Berger-Mandelbrot model was that the intervals between errors were found to be dependent (Lewis and Cox, 1966). Moreover there are a disproportionate number of 1-bit between-error intervals: 128 out of 673 intervals, while the longest interval between errors is 85,993 bits. This suggests that modelling the binary process may be a better approach than modelling the intervals, although the modelling process must be non-Markovian.


The binary DARMA(1,1) process is a candidate process for modelling this process; in particular one would like to know whether the runs of zeros for $\beta$ between zero (Markovian) and one (independent) produce highly-skewed run-length distributions. The question is too broad to be considered here, involving also estimation of suitable values of $\beta$ and $\rho$, and will be considered elsewhere. Here we will examine only the effect of $\beta$ on $E(T_0)$ and the plausibility of the model.

For the error data, 672 bits out of 1,106,148 transmitted were in error, so that we can estimate $\pi(1)$ as

$$\hat\pi(1) = 1 - \hat\pi(0) = 672/1{,}106{,}148 = 0.0006075.$$

Thus from (3.4) with $\rho = 0$ we compute that the expected lengths of runs of zeros and ones, given that they occur, are respectively $1/\hat\pi(1) = 1{,}646.09$ and $1/\hat\pi(0) = 1.000608$; the observed values from the data are 1,911.27 and 1.235, both much longer than predicted under the independence assumption ($\rho = 0$).

In Table 1 we give values of $E(T_0)$ computed from the formula (5.3).

Note that for small $\rho$, e.g. $\rho = 0.1$, the value of $E(T_0)$ first increases with increasing $\beta$, and then decreases. This is characteristic behavior for the process when it is almost a moving average. For large $\rho$, we find $E(T_0)$ decreasing with $\beta$. In particular $\beta$ has a large effect on $E(T_0)$; it remains to be seen how $\beta$ affects the whole distribution of runs.

The estimated first and second serial correlation coefficients for the data, $\hat\rho(1)$ and $\hat\rho(2)$, are 0.180 and 0.121 respectively. This is consistent with the model's restriction to positive serial correlation coefficients. From the expressions for $\rho_1(j)$ given in Section 5 we see that $\hat\rho(2)/\hat\rho(1) = 0.67$ is a rough estimate for $\rho$; thus $\rho$ is relatively large for this data. With the proviso that $\rho$ is relatively large it might be possible to find unique values of $\beta$ and $\rho$ which, with the estimated $\pi(1)$, would make (5.3) and (5.4) equal to the estimated $E(T_0)$ and $E(T_1)$. (This is not possible in general since the expressions (5.3) and (5.4) are not single-valued functions of $\beta$ for small fixed $\rho$.) An alternative is to use the estimates of $\rho$ and $E(T_0)$, with $\hat\pi(0)$ and $\hat\rho(1)$, in (5.3) and solve
for  p.  The  rough  estimate  obtained  this  way  is  p = O.Pa. 

It appears to be possible to estimate $\beta$ and $\rho$ in a more systematic way using higher order joint moments of the $X_n$'s. This will be discussed elsewhere.